3.7 Efficiency of estimators. In this and the following two sections the distribution of
the data is assumed to belong to a parametric family $\{P_\theta, \theta \in \Theta\}$, having densities $f(\theta, x)$.

The information inequality or Fréchet-Cramér-Rao lower bound, when $\Theta$ is an open
interval in $\mathbb{R}$ and $g$ is a differentiable real-valued function on $\Theta$, is

$\operatorname{var}_\theta(T_n) \ge g'(\theta)^2/(n I_1(\theta))$,

where $I_1(\theta) := E_\theta((\partial \log f(\theta, x)/\partial\theta)^2)$, as was proved in Theorem 2.4.10 under some regularity
conditions when $T_n$ is an unbiased estimator of $g(\theta)$. But by Theorem 2.4.15, if $\log f(\theta, x)$
is sufficiently smooth in $\theta$, the lower bound is attained for all $\theta$ only when the family of
distributions is exponential of order 1 with $T(x)$ equal to the given estimator $T_n(x)$.
When this is true for one function $T(\cdot)$, the only other functions for which it holds are
$aT(\cdot) + b$ where $a \ne 0$ and $b$ are constants. So the only functions having unbiased estimators
attaining the information inequality lower bound for all $\theta$ are $a g(\theta) + b$ where now $a$ and $b$
are any constants and $g$ is the specific function $d \log K(\theta)/d\theta$ for which $T$ is the unbiased
estimator, by Corollary 2.5.9. Even for exponential families of order 1, unique unbiased,
admissible estimators (for other functions) may be unsatisfactory, as in the example at the
end of Sec. 2.5.

If the information inequality provided best possible lower bounds for mean-square
errors only for estimating functions $a g(\theta) + b$ as just described, it would not be very useful.
There is, however, an asymptotic lower bound,

(3.7.1) $\liminf_{n\to\infty} E_\theta([n^{1/2}(T_n - g(\theta))]^2) \ge g'(\theta)^2/I(\theta)$,

where $I(\theta) = I_1(\theta)$, which is valid under rather general conditions, without unbiasedness,
as will be shown here first for $g(\theta) = \theta$, so $g'(\theta) = 1$, in Theorem 3.7.3, then for more
general $g$ in Theorem 3.7.9. First, though, it will be seen that the bound may no longer
hold for all $\theta$:

Example. Let $X_1, X_2, \ldots$ be i.i.d. with a normal distribution $N(\mu, 1)$. Then the sample
mean $\bar X = (X_1 + \cdots + X_n)/n$ is an unbiased estimator of $\mu$ which attains the information
inequality lower bound for all $\mu$. Let $T_n(X_1, \ldots, X_n) := \bar X$ if $|\bar X| > n^{-1/4}$ and $:= 0$
if $|\bar X| \le n^{-1/4}$. If $\mu \ne 0$, then the mean-square error of $T_n$ will be asymptotic to $1/n$, in
other words $n E_\mu(T_n - \mu)^2 \to 1$ as $n \to \infty$, as for $\bar X$. But if $\mu = 0$, the probability that
$T_n = 0$ converges to 1, and $n E_0([T_n - 0]^2) \to 0$. In fact, for $\mu = 0$, $P(|\bar X| > n^{-1/4}) =
P(|Z| > n^{1/4}) \le 2e^{-\sqrt{n}/2}$ where $Z$ is a $N(0, 1)$ variable (RAP, Lemma 12.1.6), so $T_n = 0$
except with very small probability. Or, one can take $T_n = c\bar X$ for $|\bar X| \le n^{-1/4}$ where
$0 < |c| < 1$. Then for $\mu = 0$, $\sqrt{n}\,T_n$ is asymptotically $N(0, c^2)$ where $0 < c^2 < 1$.
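The behavior in the Example can be checked numerically. The following is a minimal simulation sketch, not part of the original text: the function names `hodges` and `scaled_mse`, the sample size, the number of replications and the seed are all illustrative assumptions. It uses the fact that $\bar X$ is exactly $N(\mu, 1/n)$, so $\bar X$ is drawn directly rather than averaging $n$ observations.

```python
import math
import random

def hodges(xbar, n):
    """Thresholded estimator of the Example: keep xbar unless |xbar| <= n**(-1/4)."""
    return xbar if abs(xbar) > n ** -0.25 else 0.0

def scaled_mse(mu, n, reps=20000, seed=0):
    """Monte Carlo estimate of n * E_mu[(T_n - mu)^2] (illustrative settings)."""
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(n)  # the mean of n i.i.d. N(mu, 1) draws is N(mu, 1/n)
    total = 0.0
    for _ in range(reps):
        xbar = rng.gauss(mu, sd)
        total += (hodges(xbar, n) - mu) ** 2
    return n * total / reps

if __name__ == "__main__":
    for mu in (0.0, 1.0):
        print(mu, scaled_mse(mu, n=10_000))
```

Under these assumptions the scaled error should be near 1 for $\mu = 1$ and near 0 for $\mu = 0$, matching the superefficiency at 0 described above.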

A sequence of estimators which asymptotically attains the information inequality lower
bound at a given $\theta$ is called "efficient" at that $\theta$. A sequence with a smaller asymptotic
variance, like the sequence in the last example at $\mu = 0$, is called "superefficient" at the
given $\theta$.

More complicated examples would show that without increasing the asymptotic variance
of $T_n$ for any $\mu$, it could be made superefficient at some values of $\mu$ in a finite or
countable set. It will be shown under some conditions below that as $\theta$ varies over an
interval in $\mathbb{R}$, (3.7.1) will hold for almost all $\theta$ for Lebesgue measure. In other words,
superefficiency can occur at most for $\theta$ in a set of Lebesgue measure 0. First, the assumptions
will be listed.

Let $(X, \mathcal{B})$ be a measurable space (sample space). Suppose that the parameter space $\Theta$
is an open interval in $\mathbb{R}$. Let $\{P_\theta, \theta \in \Theta\}$ be a family of laws on $(X, \mathcal{B})$, dominated by
a $\sigma$-finite measure $\nu$ on $(X, \mathcal{B})$, with as usual $f(\theta, x) := (dP_\theta/d\nu)(x)$. Assume that the
densities can be chosen so that

(AV-1) There is a set $B \in \mathcal{B}$ such that for all $\theta$, $f(\theta, x) > 0$ for all $x \in B$ and $f(\theta, x) = 0$
for all $x \notin B$.

So, $\{P_\theta, \theta \in \Theta\}$ is an equivalent family as defined in Sec. 2.4.

(AV-2) $f(\theta, x)$ is a $C^2$ function of $\theta$, meaning that its first and second derivatives with
respect to $\theta$ exist and are continuous at all $\theta$ in $\Theta$, for all $x$.

For the family of laws $P_\theta = U[\theta, \theta + 1]$ on $\mathbb{R}$, there exist (unbiased) estimators of $\theta$ with
mean-square error of order $1/n^2$ (Sec. 2.4, Problem 3). Thus some regularity conditions
(equivalence, differentiability in $\theta$) cannot both just be removed.
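To see the $1/n^2$ rate concretely, here is a small simulation sketch. The estimator used, $(\min_i X_i + \max_i X_i - 1)/2$, is an assumption chosen for illustration (it is unbiased for $\theta$ under $U[\theta, \theta + 1]$, but it need not be the estimator intended in Sec. 2.4, Problem 3). The names `midrange_estimator` and `scaled_errors`, the seed and the replication count are likewise hypothetical.

```python
import random

def midrange_estimator(sample):
    """Estimate theta for U[theta, theta + 1] data by (min + max - 1) / 2."""
    return (min(sample) + max(sample) - 1.0) / 2.0

def scaled_errors(theta, n, reps=5000, seed=1):
    """Return (n^2 * MSE of the midrange estimator, n * MSE of the mean-based estimator)."""
    rng = random.Random(seed)
    se_mid = 0.0
    se_mean = 0.0
    for _ in range(reps):
        sample = [theta + rng.random() for _ in range(n)]
        se_mid += (midrange_estimator(sample) - theta) ** 2
        # sample mean minus 1/2 is also unbiased, but its MSE is only of order 1/n
        se_mean += (sum(sample) / n - 0.5 - theta) ** 2
    return n * n * se_mid / reps, n * se_mean / reps

if __name__ == "__main__":
    print(scaled_errors(theta=3.0, n=100))
```

Both printed quantities should stay bounded as $n$ grows, which shows the midrange-type estimator has error of order $1/n^2$ while the mean-based one has error of order $1/n$.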

Let $L(\theta, x) := \log f(\theta, x)$. Derivatives with respect to $\theta$ will be denoted by primes,
so that $L'(\theta, x) := \partial L(\theta, x)/\partial\theta$, etc. Then by (AV-1) and (AV-2), $L(\theta, x)$ is a $C^2$ function
of $\theta$ for any $x \in B$. The Fisher information is $I(\theta) = E_\theta(L'(\theta, x)^2)$ as defined in Sec. 2.4.
Note that if (AV-1) fails and the family is not equivalent, $L(\theta, x)$ can be $-\infty$ on a set of
$x$ which has $P_\theta$ probability 0 but positive $P_\phi$ probability for some $\phi \ne \theta$.

(AV-3) For all $\theta \in \Theta$, the Fisher information $I(\theta)$ exists with $0 < I(\theta) < \infty$, and
$E_\theta(L'(\theta, x)) = 0$.

This last equation results if the equation $\int f(\theta, x)\,d\nu(x) = 1$ can be differentiated under
the integral sign, multiplying and dividing by $f(\theta, x)$, as noted in Sec. 2.4.

(AV-4) $E_\theta(L''(\theta, x)) = -I(\theta)$ for all $\theta$.

The latter equation follows if the differentiation under the integral sign just mentioned can 
be done also for the second derivative. 

(AV-5) For any $\theta_0 \in \Theta$, there is a $\delta > 0$ and a $\mathcal{B}$-measurable function $M(x)$ such that
$|L''(\theta, x)| \le M(x)$ for all $x \in X$ and all $\theta$ with $|\theta - \theta_0| < \delta$, and $E_{\theta_0} M(x) < \infty$.

Let $X_1, X_2, \ldots$ be a sequence of i.i.d. variables in $X$ with distribution $P_\theta$. For $n =
1, 2, \ldots$, let $X^n$ be the set of all ordered $n$-tuples $(x_1, \ldots, x_n)$ with $x_i \in X$ for each $i$. Let
$\mathcal{B}^n$ be the product $\sigma$-algebra in $X^n$, i.e. the smallest $\sigma$-algebra of subsets of $X^n$ making each
coordinate projection $(x_1, \ldots, x_n) \mapsto x_i$ measurable for $i = 1, \ldots, n$. For any probability
measure $Q$ on $(X, \mathcal{B})$, let $Q^n$ be the law on $(X^n, \mathcal{B}^n)$ for which the coordinates are i.i.d.
with law $Q$.

Let $\{T_n\}_{n \ge 1}$ be a sequence of estimators (statistics), so that for each $n$, $T_n$ is
measurable from $X^n$ into $\Theta$. It will be assumed that the $T_n$ are consistent estimators of $\theta$, at
least in probability, and are asymptotically normal:

(AV-6) For each $\theta$, there is a $v(\theta)$ with $0 < v(\theta) < \infty$ such that as $n \to \infty$, the distribution
of $n^{1/2}(T_n - \theta)$ under $P_\theta^n$ converges to $N(0, v(\theta))$.

(AV-6) doesn't allow the example near the beginning of this section of superefficient
estimation at 0 with $v(0) = 0$. To deal with this we could add to the estimator an
independent variable with distribution $N(0, \delta/n)$ for $\delta > 0$ and then let $\delta$ decrease to 0.

The asymptotic normality was proved in Section 3.6 under some conditions. 

Let $\Pr_\theta$ denote probabilities for the distribution where $X_1, X_2, \ldots$ are i.i.d. with
distribution $P_\theta$.

If (AV-6) holds, then

$v(\theta) \le \liminf_{n\to\infty} E_\theta([n^{1/2}(T_n - \theta)]^2)$,

as follows. For each $K < \infty$, the function $x \mapsto \min(x^2, K)$ is bounded and continuous on $\mathbb{R}$, so

$E_\theta \min(K, n(T_n - \theta)^2) \to \int \min(K, x^2)\,dN(0, v(\theta))(x)$.

Thus for each $K < \infty$

$\liminf_{n\to\infty} E_\theta(n(T_n - \theta)^2) \ge \int \min(K, x^2)\,dN(0, v(\theta))(x)$.

Then let $K \to \infty$ and apply monotone convergence.
Thus, assuming asymptotic normality (AV-6), if

(3.7.2) $v(\theta) \ge 1/I(\theta)$

holds, then so does (3.7.1) for $g(\theta) = \theta$.

The next theorem will give an almost everywhere lower bound on efficiency of estimators
of a 1-dimensional parameter. In the proof there will be a relationship between
efficiency of estimators and tests (Lemma 3.7.4). Suppose, based on a sample of size $n$ i.i.d.
$P_\theta$, we want to test a hypothesis $\theta = \theta_0$ against some alternatives $\phi_n$ depending on $n$, with
a size that converges to some probability $\alpha$ other than 0 or 1. Let $\phi_n := \theta_0 + a_n$ for some
numbers $a_n \downarrow 0$. To have a specific example in mind, suppose that $P_\theta = N(\theta, 1)$. Then if
$a_n = o(1/\sqrt{n})$, the power of the tests will converge to $\alpha$. If $1/\sqrt{n} = o(a_n)$, the power of the
tests will converge to 1. An interesting case is where $a_n$ is of the same order of magnitude
as $1/\sqrt{n}$, specifically, $a_n = c/\sqrt{n}$ for some constant $c > 0$, giving a sequence of so-called
"Pitman alternatives." The asymptotic power of a sequence of tests of $\theta_0$ against $\phi_n$ gives
a measure of the efficiency of the sequence of tests, called Pitman efficiency. We will not
be dealing explicitly any further with Pitman efficiency, but Lemma 3.7.4 and its proof
bring tests and the Neyman-Pearson lemma into a proof about efficiency of estimators.
Under our assumptions, Lemma 3.7.6 will show that for Pitman alternatives, the power
will converge to a limit larger than $\alpha$.
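For the $N(\theta, 1)$ example, the three regimes for $a_n$ can be illustrated with the familiar one-sided test that rejects when $\sqrt{n}(\bar X - \theta_0)$ exceeds the $(1-\alpha)$ standard normal quantile $z_{1-\alpha}$; its power at $\theta_0 + a_n$ is $1 - \Phi(z_{1-\alpha} - \sqrt{n}\,a_n)$. The sketch below is an illustration only: the function name `power` and the choice $\alpha = 0.05$ are assumptions, not part of the text.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power(n, a_n, alpha=0.05):
    """Power at theta0 + a_n of the one-sided size-alpha test for N(theta, 1) data.

    The test rejects when sqrt(n) * (sample mean - theta0) exceeds the
    (1 - alpha) quantile of the standard normal distribution.
    """
    z = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(z - sqrt(n) * a_n)

if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        # a_n = 1/n (power tends to alpha), a_n = 1/sqrt(n) (Pitman alternatives,
        # power constant in n), a_n = n**(-1/4) (power tends to 1)
        print(n, power(n, 1.0 / n), power(n, 1.0 / sqrt(n)), power(n, n ** -0.25))
```

For the Pitman alternatives with $c = 1$ the power does not depend on $n$ and lies strictly between $\alpha$ and 1.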

3.7.3 Theorem. Under assumptions (AV-1) through (AV-6), (3.7.2) holds for almost all
$\theta$ in the open interval $\Theta$ for Lebesgue measure.
Proof. First, the following will be helpful:

3.7.4 Lemma. Under the assumptions of Theorem 3.7.3, if $\theta_0 \in \Theta$ and for $\theta(n) := \theta_0 + n^{-1/2}$,

$\liminf_{n\to\infty} \Pr_{\theta(n)}\{T_n < \theta(n)\} \le 1/2$,

then (3.7.2) holds for $\theta = \theta_0$.

Remark. For a fixed $\theta$, $\Pr_\theta(\sqrt{n}(T_n - \theta) < 0) \to 1/2$ as $n \to \infty$ by (AV-6).



Proof. Consider a likelihood ratio (Neyman-Pearson) test of $\theta_0$ against the simple alternative
$\theta(n)$ based on $X^{(n)} := (X_1, \ldots, X_n)$. For any $\theta \in \Theta$, $n$ and $X^{(n)}$, let

$L_n(\theta, X^{(n)}) := \sum_{i=1}^n L(\theta, X_i)$ and

$K_n := K_n(\theta_0, X^{(n)}) := [L_n(\theta(n), X^{(n)}) - L_n(\theta_0, X^{(n)}) + I/2]/I^{1/2}$,

where $I := I(\theta_0)$. Then $K_n$ is a strictly increasing function (an affine function of the logarithm) of the
likelihood ratio $R^{(n)}$ of $P^n_{\theta(n)}$ to $P^n_{\theta_0}$. In proving Lemma 3.7.4 another fact will be
needed:

3.7.5 Lemma. As $n \to \infty$, the distribution of $K_n$ under $\Pr_{\theta_0}$ converges to $N(0, 1)$.
Proof. By (AV-2) and Taylor's theorem with remainder,

$L_n(\theta(n), X^{(n)}) = L_n(\theta_0, X^{(n)}) + n^{-1/2} L_n'(\theta_0, X^{(n)}) + (2n)^{-1} L_n''(\phi_n, X^{(n)})$

where $\theta_0 < \phi_n < \theta(n)$ and $\phi_n$ depends on $X^{(n)}$, say $\phi_n := \phi_n(X^{(n)})$. Let $\zeta_n :=
n^{-1}|L_n''(\phi_n, X^{(n)}) - L_n''(\theta_0, X^{(n)})|$.

Claim. $\zeta_n \to 0$ a.s. for $\Pr_{\theta_0}$ as $n \to \infty$.

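Proof of Claim. For any $\delta > 0$ and $x \in X$, let

$A(x, \delta) := \sup\{|L''(\theta, x) - L''(\theta_0, x)| : \theta \in \Theta, |\theta - \theta_0| < \delta\}$.

Let $m(\delta) := E_{\theta_0} A(\cdot, \delta)$. (AV-5) implies that $m(\delta) < \infty$ for $\delta$ small enough. (AV-1)
and (AV-2) imply that $L''$ is continuous in $\theta$ for all $x$ (in $B$), and (AV-5) gives dominated
convergence, so $m(\delta) \downarrow 0$ as $\delta \downarrow 0$. Given $\varepsilon > 0$, take $\delta > 0$ such that $m(\delta) < \varepsilon$. Then for
each $n > \delta^{-2}$ and all $X^{(n)}$, we have

$\zeta_n \le \frac{1}{n}\sum_{i=1}^n A(X_i, n^{-1/2}) \le \frac{1}{n}\sum_{i=1}^n A(X_i, \delta)$

since $|\phi_n - \theta_0| < n^{-1/2} < \delta$. So by the strong law of large numbers, $\limsup_{n\to\infty} \zeta_n \le \varepsilon$ a.s.

Letting $\varepsilon \downarrow 0$ finishes the proof of the Claim. □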

From (AV-4) and the strong law of large numbers, it follows that $n^{-1} L_n''(\theta_0, X^{(n)}) \to
-I$ a.s. as $n \to \infty$. Then from the definition of $\zeta_n$ and the Taylor expansion,

$\left| L_n(\theta(n), X^{(n)}) - L_n(\theta_0, X^{(n)}) - n^{-1/2} L_n'(\theta_0, X^{(n)}) + \frac{I}{2} \right| \le \frac{\zeta_n}{2} + \frac{1}{2}\left| n^{-1} L_n''(\theta_0, X^{(n)}) + I \right|$

which converges to 0 a.s. Thus from the definition of $K_n$,

$K_n = n^{-1/2} L_n'(\theta_0, X^{(n)})/I^{1/2} + o(1)$

where the $o(1)$ term goes to 0 a.s., so $K_n - n^{-1/2} L_n'(\theta_0, X^{(n)})/I^{1/2} \to 0$ a.s.

Lemma 3.7.5 then follows from (AV-3), the central limit theorem (RAP, Sec. 9.5), and
the fact that if $U_n, V_n$ are random variables such that $|U_n - V_n| \to 0$ in probability, while
the laws of $V_n$ converge to some limit law $Q$, then $U_n$ have the same limiting distribution
(RAP, Lemma 11.9.4). □

Continuing with the proof of Lemma 3.7.4, let $\Phi$ be the standard normal distribution
function. Let $t$ be a constant and for each $n$ let $C_n := C_{n,t} := \{X^{(n)} : K_n > t\}$. The
next step is:

3.7.6 Lemma. For any $t \in \mathbb{R}$, $\Pr_{\theta_0}(C_{n,t}) \to 1 - \Phi(t)$ and $\Pr_{\theta(n)}(C_{n,t}) \to 1 - \Phi(t - I^{1/2})$
as $n \to \infty$.

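Proof. The first statement is clear from Lemma 3.7.5. For the second, let $H_n$ be the
distribution function of $K_n$ under $\Pr_{\theta_0}$. Then

$1 - \Pr_{\theta(n)}(C_{n,t}) = \Pr_{\theta(n)}(K_n \le t) = \int_{K_n \le t} \exp(L_n(\theta(n), x))\,d\nu^n(x)$
$= \int_{K_n \le t} \exp[L_n(\theta(n), x) - L_n(\theta_0, x)]\,dP^n_{\theta_0}(x)$
$= \int_{K_n \le t} \exp(-I/2 + I^{1/2} K_n)\,dP^n_{\theta_0} = e^{-I/2}\int_{-\infty}^{t} \exp(I^{1/2} z)\,dH_n(z)$.

Let $o(1)$ denote (as always) any term that goes to 0 as $n \to \infty$. It follows from Lemma 3.7.5
and the Helly-Bray theorem (RAP, Theorem 11.1.2), since $z \mapsto \exp(I^{1/2} z)$ is bounded and
continuous on $(-\infty, t]$ and $\Phi$ is continuous, that

$\int_{-\infty}^{t} \exp(I^{1/2} z)\,dH_n(z) \to \int_{-\infty}^{t} \exp(I^{1/2} z)\,d\Phi(z)$.

Next, we have

$e^{-I/2}\int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2} + I^{1/2} z\right) dz = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(z - I^{1/2})^2}{2}\right) dz = \Phi(t - I^{1/2})$.

Lemma 3.7.6 now follows. □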
Now continuing with the proof of Lemma 3.7.4, for each $n$ let

$D_n := \{X^{(n)} : T_n(X^{(n)}) \ge \theta(n)\}$.

Take any fixed constant $t > I^{1/2}$ and define $C_n := C_{n,t}$ as before. Then by Lemma 3.7.6,
$\Pr_{\theta(n)}(C_n)$ converges to a limit less than 1/2, and $\limsup_{n\to\infty} \Pr_{\theta(n)}(D_n) \ge 1/2$ by the
hypothesis of Lemma 3.7.4. So there is a sequence of positive integers, say $m_1 < m_2 < \cdots$,
such that

$\Pr_{\theta(n)}(D_n) > \Pr_{\theta(n)}(C_n)$ for $n = m_1, m_2, \ldots$.

For each $n$, consider $C_n$ and $D_n$ as critical regions for testing the hypothesis $\theta_0$ against
the alternative $\theta(n)$. Since by the Neyman-Pearson Lemma (Theorem 1.1.3), $C_n$ is an
admissible critical region, by the statement just before Lemma 3.7.5, but $D_n$ has larger
power, it must also have larger size:

$\Pr_{\theta_0}(D_n) > \Pr_{\theta_0}(C_n)$ for $n = m_1, m_2, \ldots$.

Recall that by Lemma 3.7.6, $\Pr_{\theta_0}(C_n) \to 1 - \Phi(t)$. Let $v := v(\theta_0)$. Then by the
definitions of $\theta(n)$ and $D_n$,

$\Pr_{\theta_0}(D_n) = \Pr_{\theta_0}(T_n \ge \theta(n) = \theta_0 + n^{-1/2}) = \Pr_{\theta_0}(n^{1/2}(T_n - \theta_0) \ge 1)$,

which by (AV-6) converges as $n \to \infty$ to

$N(0, v(\theta_0))([1, +\infty)) = N(0, 1)([1/\sqrt{v}, +\infty)) = 1 - \Phi(v^{-1/2}) \ge 1 - \Phi(t)$

(via $n = m_j$), so $v^{-1/2} \le t$. Letting $t \downarrow I^{1/2}$, it follows that (3.7.2) holds for $\theta = \theta_0$,
proving Lemma 3.7.4. □

Now continuing with the proof of Theorem 3.7.3, for any real $\theta$ let $f_n(\theta) := |\Pr_\theta(T_n <
\theta) - \frac{1}{2}|$ if $\theta \in \Theta$, or 0 otherwise. To see that this is a measurable function of $\theta$, it is enough to
show that $\theta \mapsto \Pr_\theta(T_n < \theta)$ is a measurable function of $\theta \in \Theta$, or that $(\theta, x) \mapsto \Pr_\theta(T_n < x)$
is jointly measurable. In $x$, the function is nondecreasing and left-continuous. For each $x$
and each positive integer $m$ let $j(x, m)$ be the largest integer with $j(x, m) < 2^m x$. Then
$j(x, m)/2^m \uparrow x$ and $\Pr_\theta(T_n < j(x, m)/2^m) \uparrow \Pr_\theta(T_n < x)$ for all $x$ and $\theta$. Each $j(\cdot, m)$ is
clearly a measurable function of $x$. So it will be enough to show that $\theta \mapsto \Pr_\theta(T_n < y)$ is
measurable for each fixed $y$. This is a special case of the property that the family of laws
$\Pr_\theta$ is a measurable family as defined in Sec. 1.2. To see that it is in this case, we have
that $f(\theta, x)$ is continuous in $\theta$ by (AV-2). Thus by Fatou's Lemma, for any measurable set
$A \subset X^n$, and any convergent sequence $\theta_k \to \theta$ in $\Theta$,

$P^n_\theta(A) := \int_A \prod_{j=1}^n f(\theta, X_j)\,d\nu^n(X^{(n)}) \le \liminf_{k\to\infty} \int_A \prod_{j=1}^n f(\theta_k, X_j)\,d\nu^n(X^{(n)})$.

So $\theta \mapsto P^n_\theta(A)$ is a lower semicontinuous function: $\{\theta : P^n_\theta(A) \le c\}$ is closed for any real
$c$. This implies that $\theta \mapsto P^n_\theta(A)$ is Borel measurable as desired.
It follows from (AV-6) that for each $\theta$, $\Pr_\theta(T_n < \theta) \to 1/2$ as $n \to \infty$. So, $0 \le
f_n(\theta) \le 1/2$ and $f_n(\theta) \to 0$ as $n \to \infty$ for all $\theta$. Let $g_n(\theta) := f_n(\theta + n^{-1/2})$ for any
$\theta \in \Theta$. Then $0 \le g_n \le 1/2$ also. We next need another lemma:

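3.7.7 Lemma. There is a set $N$ of Lebesgue measure 0 and a sequence $n_1 < n_2 < \cdots$
such that for any $\theta \in \Theta$ with $\theta \notin N$, $\lim_{r\to\infty} g_{n_r}(\theta) = 0$.

Proof. Substituting $\eta = \theta + n^{-1/2}$,

$\int_{-\infty}^{\infty} g_n(\theta)\,d\Phi(\theta) = \int_{-\infty}^{\infty} f_n(\theta + n^{-1/2})\,d\Phi(\theta) = \frac{e^{-1/(2n)}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f_n(\eta) \exp(n^{-1/2}\eta)\,e^{-\eta^2/2}\,d\eta \to 0$

as $n \to \infty$ by dominated convergence since $\exp(n^{-1/2}\eta) \le e^{\eta} + 1$ for all $\eta$. Convergence in
$L^1$ implies convergence in probability, which implies that there is a subsequence $g_{n_r} \to 0$
almost surely for $N(0, 1)$ (RAP, Theorem 9.2.1), and so almost everywhere for Lebesgue
measure. □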

The function $\theta \mapsto I(\theta)$ is continuous since $L''(\cdot, x)$ is continuous by (AV-1) and (AV-2),
we can apply (AV-4), and (AV-5) provides domination for the dominated convergence
theorem. Thus $I(\cdot)$ is Borel measurable. To show that (3.7.2) holds for Lebesgue almost
all $\theta$ it may be good to know that (3.7.2) holds for $\theta$ in a measurable set, which will follow
from the next lemma. This lemma will also be applied in the multidimensional case. (AV-6)
implies that $\liminf_{n\to\infty} n E_\theta((T_n - \theta)^2) \ge v(\theta)$, but the lim inf could be larger than
$v(\theta)$, so a different approach to it is needed.

For any distribution function $F$ and $0 < p < 1$, the $p$ quantile of $F$ is defined by
$F^{\leftarrow}(p) := \inf\{x : F(x) \ge p\}$. Then $F^{\leftarrow}$ is a non-decreasing, left-continuous function of
$p$.

3.7.8 Lemma. Under the assumptions (AV-2) and (AV-6), $v(\theta)$ in (AV-6) is a measurable
function of $\theta$.

Proof. Let $F_{n,\theta}(t) := \Pr_\theta(T_n \le t)$ for $-\infty < t < \infty$. In the proof of Theorem
3.7.3, before Lemma 3.7.7, it is shown from (AV-2) that $(\theta, u) \mapsto \Pr_\theta(T_n < u)$ is jointly
measurable. Taking $u = t + 1/k \downarrow t$ as $k \to \infty$, it follows that $(\theta, t) \mapsto F_{n,\theta}(t)$ is jointly measurable.
Restricting this function to $t$ rational, we have for $0 < p < 1$ that

$\theta \mapsto F^{\leftarrow}_{n,\theta}(p) = \inf\{q \in \mathbb{Q} : F_{n,\theta}(q) \ge p\}$

is measurable. Then (AV-6) implies that $n(F^{\leftarrow}_{n,\theta}(3/4) - F^{\leftarrow}_{n,\theta}(1/2))^2 \to \Phi^{\leftarrow}(3/4)^2 v(\theta)$ as
$n \to \infty$ where $\Phi$ is the standard normal distribution function. (Note that $\Phi^{\leftarrow}(1/2) = 0$.
Here $\Phi^{\leftarrow}(3/4)^2 \doteq 0.455$.) The Lemma follows. □
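The constant $\Phi^{\leftarrow}(3/4)^2 \doteq 0.455$ and the way $v(\theta)$ is recovered from two quantiles can be checked with the standard library. This is only a sketch: the helper name `variance_from_quantiles` is hypothetical, and the check uses an exactly normal law $N(\theta, v/n)$ in place of the law of $T_n$, which the lemma only assumes to be asymptotically normal.

```python
from statistics import NormalDist

def variance_from_quantiles(q_half, q_three_quarters, n):
    """Recover v from the 1/2 and 3/4 quantiles of a law close to N(theta, v/n)."""
    z = NormalDist().inv_cdf(0.75)  # the 3/4 quantile of N(0, 1)
    return n * (q_three_quarters - q_half) ** 2 / z ** 2

if __name__ == "__main__":
    z = NormalDist().inv_cdf(0.75)
    print(round(z ** 2, 3))
    # Exactly normal stand-in for the law of T_n, with theta = 2, v = 3, n = 400
    law = NormalDist(2.0, (3.0 / 400) ** 0.5)
    print(variance_from_quantiles(law.inv_cdf(0.5), law.inv_cdf(0.75), 400))
```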

Lemma 3.7.7 implies that $\liminf_{n\to\infty} g_n(\theta) = 0$ for almost all $\theta$. By the definitions,
this implies that the hypothesis, and so the conclusion, of Lemma 3.7.4 for $\theta$ in place of
$\theta_0$ holds for almost all $\theta$, which finishes the proof of Theorem 3.7.3. □

Next, Theorem 3.7.3 will be extended to estimators of functions $g(\theta)$, by the delta-method.
The factor $g'(\theta)^2$ is familiar from information inequalities (Section 2.4). Note
that (AV-1) through (AV-5) don't mention any estimators $T_n$.
3.7.9 Theorem. Assume (AV-1) through (AV-5). Let $g$ be a $C^1$ function: $\Theta \to \mathbb{R}$.
Suppose that for each $\theta \in \Theta$, there is a $w(\theta) \ge 0$ such that for each $\theta$ with $g'(\theta) \ne 0$,
$0 < w(\theta) < \infty$ and the distribution of $\sqrt{n}(T_n - g(\theta))$ under $\Pr_\theta$ converges to $N(0, w(\theta))$.
Then for Lebesgue almost all $\theta \in \Theta$, $w(\theta) \ge g'(\theta)^2/I(\theta)$.

Proof. For all $\theta \in \Theta$ such that $g'(\theta) = 0$, the inequality is trivial. Since $g$ is $C^1$, the set
where $g' \ne 0$ is open and thus a countable disjoint union of open intervals. Thus replacing
$\Theta$ by a smaller interval as needed, we can assume that on the interval $\Theta$, $g'(\theta) \ne 0$, and
specifically $g'(\theta) > 0$. Then $g$ is one-to-one and has a $C^1$ inverse $g^{-1}$. For a given $\theta \in \Theta$,
we have

$g^{-1}(g(\theta) + \phi) = \theta + (g^{-1})'(g(\theta))\phi + o(|\phi|)$

as $\phi \to 0$. We have $T_n \to g(\theta)$ in probability for $\Pr_\theta$ as $n \to \infty$. Also, $(g^{-1})'(g(\theta)) =
1/g'(\theta)$. Thus as $n \to \infty$,

$\sqrt{n}(g^{-1}(T_n) - \theta) = \sqrt{n}(T_n - g(\theta))/g'(\theta) + o_p(1)$,

so the distribution of $\sqrt{n}(g^{-1}(T_n) - \theta)$ converges to $N(0, w(\theta)/g'(\theta)^2)$, where one can
use e.g. RAP, Lemma 11.9.4. Thus (AV-6) holds for the estimators $g^{-1}(T_n)$ of $\theta$ and
$v(\theta) := w(\theta)/g'(\theta)^2$. Then by Theorem 3.7.3, $w(\theta)/g'(\theta)^2 \ge 1/I(\theta)$ for Lebesgue almost
all $\theta \in \Theta$, which proves the theorem. □
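As an illustration of the factor $g'(\theta)^2$, consider (as an assumed example, not taken from the text) $N(\theta, 1)$ data, for which $I(\theta) = 1$, with $g(\theta) = \theta^2$ and $T_n = \bar X^2$. The delta-method gives $w(\theta) = 4\theta^2 = g'(\theta)^2/I(\theta)$, so the bound of Theorem 3.7.9 is attained when $\theta \ne 0$. The sketch below estimates $n E_\theta[(T_n - g(\theta))^2]$ by simulation; the function names, seed and replication count are hypothetical.

```python
import math
import random

def scaled_mse_square(theta, n, reps=20000, seed=2):
    """Monte Carlo estimate of n * E[(Xbar^2 - theta^2)^2] for N(theta, 1) data."""
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(n)  # Xbar is exactly N(theta, 1/n)
    total = 0.0
    for _ in range(reps):
        xbar = rng.gauss(theta, sd)
        total += (xbar * xbar - theta * theta) ** 2
    return n * total / reps

def delta_method_bound(theta, fisher_info=1.0):
    """g'(theta)^2 / I(theta) for the assumed example g(theta) = theta^2."""
    return (2.0 * theta) ** 2 / fisher_info

if __name__ == "__main__":
    print(scaled_mse_square(1.0, 10_000), delta_method_bound(1.0))
```

The two printed numbers should be close to each other for large $n$.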

If $A$ is a $k \times m$ matrix, then $A'$ denotes its transpose, with $(A')_{ij} := A_{ji}$ for
$i = 1, \ldots, m$, $j = 1, \ldots, k$. In particular, if $x$ is a row vector $(x_1, \ldots, x_m)$ then $x'$ is the
corresponding column vector, and vice versa. In fact, elements of $\mathbb{R}^m$ will usually be taken
as column vectors $y$, so that $y'$ is the corresponding row vector. Matrix multiplication is
written by juxtaposition. Thus for $x, y \in \mathbb{R}^m$, $x'y = \sum_{j=1}^m x_j y_j$ is the usual dot product
$x \cdot y$. If $x, y \in \mathbb{R}^m$ and $C$ is an $m \times m$ matrix, then $x'Cy$ is the number $\sum_{i,j=1}^m C_{ij} x_i y_j$.

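The Fisher information for a single parameter extends to the Fisher information matrix
for several parameters, defined as follows. Let $\Theta$ be an open set in $\mathbb{R}^m$. For $\theta :=
(\theta_1, \ldots, \theta_m)$, let

(3.7.10) $I(\theta)_{ij} := E_\theta\left(\frac{\partial \log f(\theta, x)}{\partial\theta_i} \frac{\partial \log f(\theta, x)}{\partial\theta_j}\right)$

if the partial derivatives exist and have finite variances. In passing, let's note here that

$ds^2 := \sum_{i,j=1}^m I(\theta)_{ij}\,d\theta_i\,d\theta_j$

defines a Riemannian metric on the parameter space which doesn't depend on the choice
of parametrization. As in the development between (2.4.2) and (2.4.3), alternate forms of
$I_{ij}$ are

$I(\theta)_{ij} = E_\theta\left(\left.\left(\frac{\partial R_{\phi,\theta}}{\partial\phi_i} \frac{\partial R_{\phi,\theta}}{\partial\phi_j}\right)\right|_{\phi=\theta}\right) = \int \frac{\partial f(\theta, x)}{\partial\theta_i} \frac{\partial f(\theta, x)}{\partial\theta_j} \frac{1}{f(\theta, x)}\,d\nu(x)$.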
Bahadur (1964, p. 1550) pointed out that Theorem 3.7.3 extends to multidimensional
parameter spaces. Such an extension might be considered non-trivial in view of the Stein
phenomenon and James-Stein estimators (Section 2.7). But it turns out that the Stein
phenomenon doesn't affect the asymptotic efficiency as $n \to \infty$. First, multidimensional
forms of the assumptions (AV-1) through (AV-6) will be given.

(AC-1) Let $\Theta$ be an open set in a Euclidean space $\mathbb{R}^m$ and let $(X, \mathcal{B})$ be a sample space.
Let $\{P_\theta, \theta \in \Theta\}$ be an equivalent family of laws on $(X, \mathcal{B})$, and let $\nu$ be an equivalent law,
e.g. $\nu = P_\phi$ for some fixed $\phi$, with $f(\theta, x) := (dP_\theta/d\nu)(x)$, and $f(\theta, x) > 0$ for all $\theta$ and
$x$.

(AC-2) For all $x$, $f(\theta, x)$ is $C^2$ with respect to $\theta$, so that the first and second partial
derivatives of $f$ with respect to $\theta$ exist and are continuous at all $\theta \in \Theta$ for all $x$.

(AC-3) Let $L(\theta, x) := \log f(\theta, x)$. For each $\theta \in \Theta$, the Fisher information matrix $I(\theta)$ as
defined by (3.7.10) exists and is strictly positive definite. Also, $E_\theta(\nabla_\theta L(\theta, x)) = 0$ for the
gradient of $L$.

(AC-4) $\{E_\theta\,\partial^2 L(\theta, x)/\partial\theta_i\partial\theta_j\}_{i,j=1}^m = -I(\theta)$ for all $\theta \in \Theta$.

(AC-5) (AV-5) holds for each of the second partial derivatives $\partial^2 L(\theta, x)/\partial\theta_i\partial\theta_j$ in place
of $L''$.

(AC-6) $T_n$ are estimators of $\theta \in \Theta$ such that for each $\theta$, the distribution of $n^{1/2}(T_n - \theta)$
under $\Pr_\theta$ converges as $n \to \infty$ to some multivariate normal law $N(0, v(\theta))$ where $v(\theta)$ is
a nonnegative definite symmetric matrix.

3.7.11 Theorem. Assume (AC-1) through (AC-6). Then for Lebesgue almost all $\theta \in \Theta$,
$v(\theta) - I(\theta)^{-1}$ is nonnegative definite. Thus, $v(\theta)$ is positive definite.

Proof. We first prove a lemma.

3.7.12 Lemma. Let $C$ be a symmetric, positive definite real $m \times m$ matrix and $0 \ne \eta \in
\mathbb{R}^m$. Then

$\min\{\phi' C \phi : \phi \in \mathbb{R}^m, \eta'\phi = 1\} = 1/(\eta' C^{-1} \eta)$

and is attained for $\phi = C^{-1}\eta/(\eta' C^{-1} \eta)$.

Proof. Since $C$ is positive definite, $\phi' C \phi \to +\infty$ as $|\phi| \to +\infty$. Thus by the Lagrange
multiplier method, Appendix F, Theorem F.1 and Proposition F.2, $\phi \mapsto \phi' C \phi$ attains its
minimum on the hyperplane $\{\phi : \eta'\phi = 1\}$ at some $\phi$, and for each such $\phi$, there is a
$\lambda \in \mathbb{R}$ such that $\nabla_{\phi,\lambda}[\phi' C \phi + \lambda(\eta'\phi - 1)] = 0$. Thus $2C\phi + \lambda\eta = 0$, so $\phi = -\lambda C^{-1}\eta/2$,
$1 = -\lambda \eta' C^{-1} \eta/2$, $\lambda = -2/(\eta' C^{-1} \eta)$, and $\phi = C^{-1}\eta/(\eta' C^{-1} \eta)$ as stated, so this gives the
unique minimum. The value of the minimum is

$\phi' C \phi = (C^{-1}\eta)'\eta/(\eta' C^{-1} \eta)^2 = \eta' C^{-1} \eta/(\eta' C^{-1} \eta)^2 = 1/(\eta' C^{-1} \eta)$,

proving the Lemma. □
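Lemma 3.7.12 is easy to check numerically in a small case. The sketch below works with $m = 2$ and an arbitrarily chosen matrix $C$ and vector $\eta$ (both are illustrative assumptions, as are the helper names): it computes the claimed minimizer and minimum value, then compares against other points of the hyperplane $\{\phi : \eta'\phi = 1\}$.

```python
import random

def quad_form(c, x, y):
    """Return x'Cy for a 2x2 matrix c given as nested lists."""
    return sum(c[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def inverse_2x2(c):
    """Inverse of a nonsingular 2x2 matrix."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return [[c[1][1] / det, -c[0][1] / det], [-c[1][0] / det, c[0][0] / det]]

def constrained_minimum(c, eta):
    """Minimizer and minimum of phi'C phi subject to eta'phi = 1, per Lemma 3.7.12."""
    c_inv = inverse_2x2(c)
    denom = quad_form(c_inv, eta, eta)  # eta' C^{-1} eta
    c_inv_eta = [sum(c_inv[i][j] * eta[j] for j in range(2)) for i in range(2)]
    phi = [v / denom for v in c_inv_eta]
    return phi, 1.0 / denom

if __name__ == "__main__":
    C = [[2.0, 1.0], [1.0, 3.0]]   # symmetric positive definite (illustrative choice)
    eta = [1.0, 2.0]
    phi, value = constrained_minimum(C, eta)
    print(phi, value, quad_form(C, phi, phi))
    rng = random.Random(3)
    for _ in range(5):
        s = rng.uniform(-5.0, 5.0)
        # eta'(2, -1) = 0, so phi + s * (2, -1) stays on the hyperplane eta'phi = 1
        other = [phi[0] + 2.0 * s, phi[1] - s]
        print(quad_form(C, other, other) >= value)
```

For this choice, $\eta' C^{-1} \eta = 7/5$, so the minimum value is $5/7$.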

Now continuing with the proof of Theorem 3.7.11, let's first see what we can infer
from Theorem 3.7.3 about families with a 1-dimensional parameter. Let $\zeta \in \Theta$ and
$\phi \in \mathbb{R}^m$. Since $\Theta$ is open in $\mathbb{R}^m$, the real $t$ such that $\theta := \zeta + t\phi \in \Theta$ form
an open set in $\mathbb{R}$, which is a countable union of open intervals. Consider the family
with 1-dimensional parameter $t$ for $t$ in such an interval, $Q_t := Q_{\zeta,\phi,t}$ with densities
$dQ_t(x)/d\nu = f_{\zeta,\phi}(t, x) := f(\zeta + t\phi, x)$ for $\zeta + t\phi \in \Theta$. Let $\eta \in \mathbb{R}^m$ satisfy $\eta'\phi = 1$. Then
$\eta'(\zeta + t\phi) = \eta'\zeta + t$. Under $\Pr_\theta$, $\eta'\sqrt{n}[T_n - (\zeta + t\phi)] = \sqrt{n}[\eta'(T_n - \zeta) - t]$ has distribution
converging as $n \to \infty$ to $N(0, \eta' v(\zeta + t\phi)\eta)$. Let $K_{\zeta,\phi}(t)$ be the Fisher information of
the family $Q_t$, a real-valued function of $t$. Taking the gradient $\nabla_\theta f(\theta, x)$, then letting
$\theta = \zeta + t\phi$, we get

(3.7.13) $K_{\zeta,\phi}(t) = \phi' I(\theta) \phi$

because

$K_{\zeta,\phi}(t) = E_{\zeta+t\phi}\left[(\partial \log f(\zeta + t\phi, x)/\partial t)^2\right] = E_{\zeta+t\phi}\left[\left(\frac{\phi'\nabla_\theta f(\zeta + t\phi, x)}{f(\zeta + t\phi, x)}\right)^2\right]$.

By Theorem 3.7.3 applied to the family $Q_t$ and to $\eta'(T_n - \zeta)$ as a sequence of estimators
of $t$, we get for $\theta := \zeta + t\phi$

(3.7.14) $\eta' v(\theta) \eta \ge 1/[\phi' I(\theta) \phi]$

for Lebesgue almost all $t$ such that $\theta = \zeta + t\phi \in \Theta$.

Now, supposing heuristically for the moment that (3.7.14) holds for all $\phi$ such that
$\eta'\phi = 1$, or at least for a countable dense set of such $\phi$, we can take the supremum of the
right side of (3.7.14) by taking the infimum of the denominator, which by Lemma 3.7.12
gives

(3.7.15) $\eta' v(\theta) \eta \ge \sup\{1/[\phi' I(\theta) \phi] : \eta'\phi = 1\} = \eta' I(\theta)^{-1} \eta$.

This is the conclusion of Theorem 3.7.11 if we can prove it for $\lambda^m$-almost all $\theta$, where
$\lambda^m$ is Lebesgue measure on $\mathbb{R}^m$.

Returning to the rigorous proof, let $D_1$ be a countable dense set in the unit sphere
$S^{m-1} := \{\eta \in \mathbb{R}^m : |\eta| = 1\}$. For each $\eta \in D_1$, let $E_\eta$ be a countable dense set in the
hyperplane $\{\phi \in \mathbb{R}^m : \eta'\phi = 1\}$, the same relation between $\eta$ and $\phi$ as in (3.7.14) above.
For each $\eta \in D_1$, $\phi \in E_\eta$ and $\zeta \in \phi^\perp := \{\zeta \in \mathbb{R}^m : \zeta'\phi = 0\}$ such that $\theta = \zeta + t\phi \in \Theta$
for some real $t$, we get a one-parameter family $Q_t$ as defined above.

Now $\theta \mapsto I(\theta)$ is continuous from $\Theta$ into $\mathbb{R}^{m^2}$ since $f$ is $C^2$ in $\theta$ by (AC-2) and $f > 0$
by (AC-1), so $\log f$ is $C^2$ in $\theta$. We use the form of $I(\theta)$ given by (AC-4), and we have local
domination by (AC-5), so the dominated convergence theorem applies to give continuity
of $I(\cdot)$.

Applying Lemma 3.7.8 to $\eta' v(\theta) \eta$ for suitable unit vectors $\eta$, namely the basis vectors
$e_i$ and $(e_i + e_j)/\sqrt{2}$, $i \ne j$, which we can assume are all in $D_1$, we see that
the matrix elements of $v(\theta)$ and thus $v(\theta)$ itself are measurable functions of $\theta$. Thus the
set of all $\theta \in \Theta$ for which (3.7.14) holds for fixed $\eta$ and $\phi$ is a measurable set. For $\phi$ fixed
and $\zeta$ varying in $\phi^\perp$, $t \in \mathbb{R}$, by the Tonelli-Fubini theorem, we get that (3.7.14) holds for
Lebesgue almost all $\theta \in \Theta$ for given $\eta$ and $\phi$.



Taking a countable union of sets of Lebesgue measure 0, we get that for any $\eta \in D_1$,
for Lebesgue almost all $\theta \in \Theta$, we get (3.7.15) with the supremum restricted to $\phi \in E_\eta$,
which suffices since $E_\eta$ is dense in $\{\phi : \phi'\eta = 1\}$, namely

$\eta' v(\theta) \eta \ge \sup\{1/[\phi' I(\theta) \phi] : \phi \in E_\eta\} = \eta' I(\theta)^{-1} \eta$

by Lemma 3.7.12. Taking another countable union over $\eta \in D_1$, we get that $v(\theta) - I(\theta)^{-1}$
is nonnegative definite for Lebesgue almost all $\theta \in \Theta$. When this occurs, since $I(\theta)^{-1}$ is
strictly positive definite, so is $v(\theta)$, proving the Theorem. □

NOTES 

The idea that $n$ times the variances of consistent and asymptotically normal estimators
$T_n$ should be asymptotically at least $1/I(\theta)$ goes back to Fisher. J. L. Hodges found
the example as given after (3.7.1) where $1/I(\theta)$ is asymptotically attained for all $\theta \ne 0$ and
the variance is smaller (asymptotically vanishing) for $\theta = 0$, a phenomenon called
"superefficiency." The example was published in Le Cam (1953) along with the first statement and
proof that $n \cdot \operatorname{var}(T_n)$ is asymptotically bounded below by $1/I(\theta)$ for Lebesgue almost all $\theta$.
Bahadur (1964) stated and proved the version of that fact given in this section, Theorem
3.7.3. Lehmann (1983, Theorem 6.1.1) gave a statement, but not proof, of Bahadur's
theorem. Theorem 3.7.3 benefited from both Bahadur's and Lehmann's expositions. Lehmann
points out the step from Theorem 3.7.3 to Theorem 3.7.9 by the delta-method. For another
proof of the multidimensional Theorem 3.7.11, see van der Vaart (1998), Theorem 8.9.

REFERENCES 

Bahadur, R. R. (1964). On Fisher's bound for asymptotic variances. Ann. Math. Statist.
35, 1545-1552.

Le Cam, Lucien (1953). On some asymptotic properties of maximum likelihood estimates
and related Bayes' estimates. Univ. Calif. Publ. in Statist. 1, 277-330.

Lehmann, Erich (1983). Theory of Point Estimation. Wiley, New York.

van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press.






